feat(expert): human-in-the-loop tool permissions (#421) #7639
andypalmi wants to merge 53 commits into
…ion in MCP server tests
…crash Add an ErrorBoundary and wrap each answer item in it, so a failure in one section degrades only that section instead of blanking the whole message. Also guard the optional streamable chain in StandardResourceCard that could throw on a null value and take down the message.
) The Expert can ask 1-4 clarifying questions in a single turn, each rendered as its own single- or multi-select option card; all answers are collected before the turn is submitted. Answered cards can be edited and resubmitted, and a card from a past turn is disabled once a newer message arrives. Adds a follow-up-questions cadence setting (all at once vs one at a time) in the composer settings menu, shipped to the agent via the expert context.
Add an always-visible Plan mode toggle to the composer. When enabled, the Expert proposes a plan instead of making changes, rendered as a plan card with Approve, Edit, Request changes and Reject actions:
- Approve exits plan mode and proceeds with the plan.
- Edit loads the plan markdown into the composer for direct editing.
- Request changes focuses an empty composer to describe a change in words.
- Reject abandons the plan.
The plan card renders its markdown through RichContent (passing the message and answer uuids it requires), and reuses the composer's pending-input and auto-grow behaviour. Plan mode and the approval signal are shipped to the agent via the expert context.
Plan mode is only meaningful inside the instance/device editor for now, so gate the composer toggle on immersive mode and force the persisted planMode off whenever the user is outside immersive (including on load), preventing a stale value from being sent in non-immersive contexts.
- Guard the optional streamable chain in FlowResourceCard directly instead of relying on a render boundary to mask the throw. Reduce ErrorBoundary to a single last-resort backstop per answer item in AiMessage; drop the per-section boundary wrappers in AnswerWrapper.
- Rewrite QuestionsList on top of the existing ff-radio-group (single-select) and ff-checkbox (multi-select) components so options look like standard, clickable form controls and stay consistent with the rest of the app.
- Replace the imperative growComposerToContent DOM measuring with CSS field-sizing on the textarea; drop the manual reflows and the auto-grown flag. The composer auto-sizes to content and pins to an explicit height only after a drag-resize.
Per review: the local-catch pattern was the only one in the frontend; the rest of the app leans on the global app.config.errorHandler. The real throws are now guarded at their source (the optional streamable chains in the resource cards), so the boundary was redundant. Remove it entirely and let genuinely unexpected render errors surface through the global handler like everywhere else.
Replace the composer kebab menu with a settings gear that opens an ff-dialog. The follow-up-questions cadence control now lives in the dialog as an ff-radio-group, with a FormHeading per section so the panel can grow as more settings are added.
…n-mode # Conflicts: # frontend/src/components/expert/components/ExpertChatInput.vue # frontend/src/components/expert/components/messages/components/AnswerWrapper.vue
Add per-tool approval for the Expert's flow-building tools in the immersive editor. The agent gates each tool call at the toolsNode seam by class (read/write/delete) and per-tool preference; write/delete default to Ask and surface an inline approval card (Allow / Always allow / Never) that holds the call open with no session timeout, while read defaults to allow.
- Catalog delivered over HTTP (GET /api/v1/expert/mcp/tools), curated to friendly names so raw tool identifiers never reach the browser; a per-response hash triggers a background refetch when the catalog drifts.
- HITL state consolidated into the product-assistant store (defaults, per-tool preferences, pending-approval map) with SemVer version gating.
- Settings panel groups versioned tool variants into one family and points update hints at the newest variant's required version.
- Role inheritance is fail-closed: read-only members cannot enable or trigger write/delete tools and are shown why.
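The gating rule in this commit can be sketched as a small resolution function. This is an illustrative sketch only: `resolvePermission`, `CLASS_DEFAULTS`, and the shape of the tool and options objects are assumptions, not the store code in this PR. It shows read defaulting to allow, write/delete defaulting to ask, a per-tool preference overriding the class default, and read-only members failing closed.

```javascript
// Hypothetical sketch: names and object shapes are assumptions, not the PR's code.
const CLASS_DEFAULTS = { read: 'allow', write: 'ask', delete: 'ask' }

function resolvePermission (tool, options = {}) {
    const {
        toolPreferences = {}, // per-tool choices keyed by a (hypothetical) tool id
        classDefaults = CLASS_DEFAULTS,
        readOnlyMember = false
    } = options

    // Fail closed: a read-only member can never run a write/delete tool,
    // regardless of any saved preference.
    if (readOnlyMember && tool.class !== 'read') return 'deny'

    // A per-tool preference wins over the class default.
    if (toolPreferences[tool.id]) return toolPreferences[tool.id]

    // Unknown classes also fail closed rather than defaulting to allow.
    return classDefaults[tool.class] ?? 'deny'
}
```

Checking the role before any preference keeps the fail-closed guarantee independent of whatever happens to be persisted in the browser.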
Use FormHeading for the section titles and ff-data-table for both the action-type defaults and the flow-building tool list, replacing the bespoke section/group styling and the non-standard uppercase scope headers. Bordered table rows pair each tool with its permission control across the row rather than leaving them to float across whitespace; tool scope moves into a Type column. The approval card no longer sends or renders a tool summary; the tool name, scope and call parameters describe the action.
8a97b8e to
bdb36fb
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #7639      +/-   ##
==========================================
+ Coverage   75.35%   75.37%   +0.02%
==========================================
  Files         425      425
  Lines       22487    22518      +31
  Branches     5930     5945      +15
==========================================
+ Hits        16944    16973      +29
- Misses       5543     5545       +2
…nd platform tools Fetch the tool catalog when the Expert panel mounts (not only in the editor) so the permissions settings render wherever the Expert is. Split the settings into a Flow Building Tools section, with its own per-action-type default permissions, and a separate FlowFuse Platform Tools section (a placeholder until those tools ship, with TODOs marking where they get mapped in). Flow-building tools are listed everywhere but noted as usable only from an instance editor.
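The earlier per-tool approval commit notes that a per-response hash triggers a background refetch when the catalog drifts. A minimal sketch of that check, assuming a hypothetical `createCatalogWatcher` helper and an injected `refetch` callback standing in for the call to GET /api/v1/expert/mcp/tools (neither name comes from this PR):

```javascript
// Hypothetical sketch: createCatalogWatcher and refetch are illustrative names.
function createCatalogWatcher (refetch) {
    let knownHash = null
    return {
        // Record the hash that came back alongside a fetched catalog.
        setCatalogHash (hash) {
            knownHash = hash
        },
        // Call with the hash carried on each chat response.
        // Returns true when a refetch was triggered.
        onResponseHash (hash) {
            if (!hash || hash === knownHash) return false
            // Fire-and-forget: the chat reply is not blocked on the refetch.
            // A real implementation would also de-duplicate in-flight requests.
            refetch()
            return true
        }
    }
}
```

Comparing hashes instead of versions means the browser does not need to know which instance version produced the catalog.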
- Show plain Read / Write / Delete scope instead of phrases like Read only
- Stop the Setup Guide badge rendering above the approval card
- Disable the action buttons as soon as a choice is made
handleMessageResponse lived in the composer's send handler, so a reply to a query sent from the questions card was fetched but never rendered over HTTP (without comms-beta the reply only renders via the MQTT push handler). Fold the response handling into handleQuery so every entry point (composer and the question/plan cards) renders the reply without re-implementing it.
…n-mode # Conflicts: # frontend/src/components/expert/components/ExpertChatInput.vue # frontend/src/stores/product-expert.js
…rmissions # Conflicts: # frontend/src/components/expert/components/ExpertChatInput.vue
Raise the conversation-history expiry from 28 to 30 minutes (warning at 27), so the human-in-the-loop tool-approval wait, which is bounded by the session lifetime, has the full 30-minute window the agent now allows.
    for (const m of this._agentStore.messages) {
        if (!Array.isArray(m.answer)) continue
        for (const a of m.answer) {
            if (a.kind === 'tool-approval' && a.status === 'pending') a.status = 'denied'
Couldn't test this without the agent side, but reading the code: on Stop, cancelPendingToolApprovals sets status='denied' on the store answer, but the card renders a shallow copy of it (useStreamingList({ shallow: true })), so its status prop never updates. Worth confirming, but it looks like Stop won't resolve an open approval card.
Good catch, confirmed. The card renders a detached streaming copy of the answer (AiMessage uses useStreamingList with shallow: true), so writing the status onto the store message never reached it. On Stop the buttons stayed live.
Fixed by recording the outcome in a reactive per-id map (toolApprovalStatuses) on the product-assistant store. AnswerWrapper now feeds the card its status from that map, so an external resolution (Stop / Start Over) updates a card the user never pressed. localStatus stays for instant feedback on the user's own press. Added store and product-expert tests covering the denied-on-cancel path.
    // A new chat drops the per-session tool grants ("Always allow/deny for this chat").
    useProductAssistantStore().clearSessionToolOverrides()
Do we want a cancelPendingToolApprovals() here too?
Yes. startOver now calls cancelPendingToolApprovals() first, so any approval still awaiting a decision resolves (as denied) and the agent's paused tool call unblocks instead of hanging on a message we are about to drop. It also clears toolApprovalStatuses alongside the session overrides.
n-lark left a comment
Hey, so I cannot test the approve/deny part of this in the chat because the staging env does not have PostHog synced up. The permissions page under the settings UI looks fine to me, but I don't feel comfortable approving this since I cannot test it and am unfamiliar with the feature. I'd recommend @cstns or @Steve-Mcl take a look.
…421) Add automated coverage for the human-in-the-loop tool-permission work:
- forge GET /mcp/tools: auth (401 instance/device), team-access (404), missing teamId (400), catalog+hash proxy, empty-response defaults and upstream error propagation.
- product-assistant store: permission-resolution engine (class/group helpers, per-team defaults, per-tool and session overrides, resolved permissions, version gating, and the pending-approval registry).
- product-expert store: catalog fetch and the approval round-trip (session short-circuit, resolve/always-allow/always-deny, cancel).
The approval card renders a detached streaming copy of its answer, so writing a resolved status onto the store message never reached it. On chat stop the card stayed on its Allow/Deny buttons even though the pending call had been denied. Record approval outcomes in a reactive per-id map on the product-assistant store and have AnswerWrapper feed the card its status from that map, so an external resolution (chat stop / Start Over) updates a card the user never pressed. Start Over now also cancels open approvals and clears the map.
Thanks for the review. Pushed a fix for the Stop issue you spotted.

Root cause: the approval card renders a detached streaming copy of its answer (AiMessage uses useStreamingList with shallow: true), so a status written onto the store message never reached the card. Clicking Allow/Deny worked only because the card tracks its own localStatus; the external Stop path had no way in, so the buttons stayed live.

Fix: approval outcomes are now recorded in a reactive per-id map (toolApprovalStatuses) on the product-assistant store, and AnswerWrapper feeds the card its status from that map. That covers external resolutions, Stop and Start Over, on a card the user never pressed. Start Over also cancels open approvals first so the paused tool call unblocks. This is in-memory session state only, not persisted, same lifecycle as the session overrides. Added store and product-expert tests for the denied-on-cancel path.

On testing: understood you cannot exercise approve/deny on staging without the agent side synced. @cstns or @Steve-Mcl, a second look would be welcome given the reactivity change.
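The per-id status map described in this fix can be sketched in plain JavaScript. A plain object stands in for the reactive store state here, and the method names other than `cancelPendingToolApprovals` and `toolApprovalStatuses`, the `'approved'` status value, and the precedence inside `cardStatus` are all assumptions made for illustration.

```javascript
// Hypothetical sketch: a plain object stands in for the reactive Pinia store.
const approvalStore = {
    pendingApprovalIds: new Set(),
    toolApprovalStatuses: {}, // answer id -> recorded outcome

    requestApproval (id) {
        this.pendingApprovalIds.add(id)
    },
    resolveApproval (id, status) {
        this.toolApprovalStatuses[id] = status
        this.pendingApprovalIds.delete(id)
    },
    // Stop / Start Over: every approval still pending resolves as denied.
    cancelPendingToolApprovals () {
        for (const id of [...this.pendingApprovalIds]) {
            this.resolveApproval(id, 'denied')
        }
    }
}

// What the wrapper would hand to the card: the user's own press first
// (instant feedback), then any externally recorded outcome, else pending.
// This precedence is an assumption, not confirmed by the PR.
function cardStatus (id, localStatus) {
    return localStatus ?? approvalStore.toolApprovalStatuses[id] ?? 'pending'
}
```

Because the card reads its status by id from the map, it no longer matters that the answer object it renders is a detached copy.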
#7598) Co-authored-by: Steve-Mcl <sdmclaughlin@gmail.com> Co-authored-by: Stephen McLaughlin <44235289+Steve-Mcl@users.noreply.github.com> Co-authored-by: Andrea Palmieri <76187074+andypalmi@users.noreply.github.com> Co-authored-by: andypalmi <andrea@flowfuse.com>
…omationsHandler integration # Conflicts: # frontend/src/stores/context.js
…421) Curate the FlowFuse platform automation tools from the handler singleton (app.comms.platformAutomation) into the /mcp/tools catalog alongside the flow-building tools, tagged group:'platform' so the UI routes them to their own section with their own read/write/delete defaults. Read/write/delete class is derived from each tool's MCP annotations; platform tools carry no nr-assistant version window.
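Deriving a class from MCP annotations could look like the sketch below. `readOnlyHint` and `destructiveHint` are standard MCP tool annotation fields, but the mapping itself and the name `classFromAnnotations` are assumptions; the PR's curatePlatformTool may derive the class differently.

```javascript
// Hypothetical mapping from MCP tool annotations to a permission class.
function classFromAnnotations (annotations = {}) {
    // An explicit read-only hint takes priority over everything else.
    if (annotations.readOnlyHint === true) return 'read'
    // Only an explicit destructive hint maps to delete in this sketch. The MCP
    // spec treats an omitted destructiveHint as true, so a stricter mapping
    // could classify un-annotated tools as delete instead.
    if (annotations.destructiveHint === true) return 'delete'
    return 'write'
}
```

Either way, an un-annotated tool never lands in the read class, so it is never allowed by default.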
Replace the mid-turn approval round-trip with a stateless defer/resume flow so the agent never stays resident waiting on a human. When a turn needs approval it returns the approval card(s) and ends; the browser collects the decisions and sends them back in one resume message that continues the turn.
- product-expert: track the open approval batch, resume once every card is answered, transport-agnostic (MQTT push or awaited HTTP reply).
- product-assistant: drop the promise-based pending-approval registry; the store now only records per-card outcome statuses and session grants.
- tests updated for the batch model.
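A minimal sketch of the batch model, assuming a hypothetical `createApprovalBatch` helper and resume-message shape (neither is taken from this PR). The injected `sendResume` callback keeps the sketch transport-agnostic, so it could be backed by MQTT or HTTP.

```javascript
// Hypothetical sketch of the defer/resume batch; names and message shape are assumptions.
function createApprovalBatch (cardIds, sendResume) {
    const decisions = new Map()
    let resumed = false
    return {
        // Record one card's decision. Returns false if the card is not part of
        // this batch or the batch has already been sent.
        decide (cardId, decision) {
            if (resumed || !cardIds.includes(cardId)) return false
            decisions.set(cardId, decision)
            // Resume only once every card in the batch has an answer.
            if (decisions.size === cardIds.length) {
                resumed = true
                sendResume({ type: 'resume', decisions: Object.fromEntries(decisions) })
            }
            return true
        }
    }
}
```

Sending a single resume message per batch means the agent holds no state between the deferred turn and its continuation.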
…permissions # Conflicts: # frontend/src/stores/context.js # test/unit/forge/routes/auth/permissions_spec.js
Temporary console traces to diagnose why a support-agent request takes the HTTP path instead of MQTT. Logs the resolved feature checks that drive shouldUseMqtt (platform external-broker flag, team-type broker flag, and the combined value) plus the chosen transport per send.
…permissions # Conflicts: # frontend/src/components/expert/components/ExpertChatInput.vue # frontend/src/components/expert/components/messages/components/AnswerWrapper.vue # frontend/src/stores/context.js # frontend/src/stores/product-expert.js
…a session grant The individual-tools table sized the tool-name column to its content, so promoting a per-chat session grant to a saved default removed the inline session note and its action, shrinking that column and sliding the Type (scope) column left. Let the tool-name column absorb the free space so the Type and Permission columns stay pinned right regardless of the note.
Platform and UI automation tools now carry a top-level title used as their human-friendly label. The title is forwarded on the wire definitions and MCP registration, and the platform catalog entry prefers it over the name-derived label.
…lity Send a supportsHITL flag in the Expert context. Instances predating the human-in-the-loop tool permissions (#421) omit it, letting the agent fall back to running flow-building tools at every scope and gating platform write/delete tools instead of treating the user as read-only.
    /**
     * Retrieve the curated tool catalog for the Expert's human-in-the-loop permissions UI
     * (#421). Returns the merged catalog for both sections the UI shows:

    /**
     * Maps a platform automation tool's wire definition into a catalog entry for the
     * Expert permissions UI (#421). Platform tools carry standard MCP annotations

    const platformHandler = app.comms?.platformAutomation
    if (platformHandler) {
        const platformDefs = platformHandler.getToolDefinitions() || []
        catalog.push(...platformDefs.map(curatePlatformTool))
if we add titles on the mcp tools, do we need to map these?
    },
    // Size of the underlying buttons (passed through to ff-button). Defaults to
    // 'medium' to match existing usages; 'small' suits dense contexts like tables.
    size: {
the prop name is self explanatory, adding possible options should be enough
    @@ -0,0 +1,170 @@
    <template>
        <div class="json-viewer">
we should scope this as a generic component otherwise we'll end up replicating it. Please move it under frontend/src/components. It's not as abstracted as i would have liked but we can do that later on
    confirm-label="Done"
    :can-be-canceled="false"
    data-el="expert-settings-dialog"
    boxClass="max-w-[54rem]!"

    <ff-data-table-row v-for="tool in group.tools" :key="tool.familyKey">
        <ff-data-table-cell class="tool-col">
            <div class="tool-permissions__cell">
                <span class="tool-permissions__title">{{ tool.displayName }}</span>
this will be hard to maintain going forward. Encapsulating smaller, self-contained Vue components adds not only ease of maintenance but also local state that can be managed at the component level, letting you avoid the sausage-type methods in this component.
      questionCadence: useProductExpertStore().questionCadence,
    - planMode: useProductExpertStore().planMode
    + planMode: useProductExpertStore().planMode,
    + // Signals that this FlowFuse version implements human-in-the-loop tool
no need for such a long winded note, more concise
    const MAX_DEBUG_LOG_ENTRIES = 100 // maximum number of debug log entries to keep

    // --- Expert tool permissions (human-in-the-loop, #421) -----------------------
these feel like they belong in a helper
    // Expert tool permissions (HITL, #421). The catalog + hash are refreshed from
    // the agent; defaults + preferences are the user's choices (persisted below).
    toolCatalog: [],
    toolCatalogHash: null,
these are expert concerns not assistant, or am i wrong?
Human-in-the-loop tool permissions for the Expert
Implements per-tool human-in-the-loop permissions for the Expert's flow-building tools, in the immersive editor, as described in FlowFuse/product#421. The builder (and their team role) controls which flow-building actions the Expert may run, which need approval, and which are off limits, so it never makes a change they would not have allowed.
Stacked on #7635 (feat/408-expert-plan-mode), which is the base of this PR and should merge first.
What it does
Architecture
toolsNode seam (sibling of the plan-mode gate): role check first, then per-tool policy. allow runs, deny feeds the denial back to the model so it adapts and explains, ask publishes expert:tool-approval and awaits the browser's decision.
GET /mcp/flow-tools endpoint (friendly name + scope + version window only). Forge exposes GET /api/v1/expert/mcp/tools, which proxies that endpoint and returns the merged catalog: FlowFuse platform tools are curated into the same array (tagged as a platform group), wired and commented out until the platform-tool work is merged, at which point it's a one-line switch. Every chat response carries a hash of the flow-building catalog; the browser refetches only when the hash diverges, so it stays correct across rolling deploys where instances can be on different versions.
UI
The settings panel follows existing FlowFuse patterns: FormHeading for section titles, ff-data-table for the defaults and tool lists, ff-accordion to collapse the per-tool detail, and the shared three-button toggle for each policy control, so each tool name lines up with its own control across the row border.
Out of scope (follow-ups)
Testing
- forge/ee/routes/expert/index_spec.js (new MCP tools Endpoint block): GET /mcp/tools auth (401 for instance/device tokens), team-access (404 for non-members), missing teamId (400 from the querystring schema), the flow-tools catalog + hash proxy (asserting the upstream /mcp/flow-tools URL and service token), the empty-response defaults (catalog: [], hash: null), and upstream error-status propagation.
- frontend/src/stores/product-assistant.spec.js (new tool-permissions block): the permission-resolution engine, i.e. the classOf / groupOf helpers, per-team class defaults, saved vs. session policy resolution, resolvedToolPermissions, version gating (toolAvailabilityFor), catalog/preference/override mutations, resetGroupClassPreferences, promoteSessionOverride, and the pending-approval registry.
- frontend/src/stores/product-expert-tool-permissions.spec.js (new): catalog fetch (success / no-team / error) and the approval round-trip (session short-circuit, resolve, always-allow, always-deny, cancel).
Requires matching agent-side changes.
Refs FlowFuse/product#421
Screenshots
Expert Settings with Tools permissions
Counter of how many tools have different permissions than their scope's permission
Permissions reset and change behaviour per scope
Screen.Recording.2026-07-01.at.17.24.46.mov
Option to save a permission set for the current session
Screen.Recording.2026-07-01.at.17.32.09.mov
Approval Cards
Screen.Recording.2026-07-01.at.17.30.45.mov